I Foreword


This is the final report for the ARO MURI Sequential Adaptive Multi-Modality Target Detection
and Classification using Physics-Based Models, Grant number DAAD19-02-1-0262, covering the
period July 2002 through Sept. 2006.

The MURI has supported four co-PIs, ten graduate students, and two post-doctoral fellows.
Significant  research  progress  has  been  made  in  this  MURI.  The  MURI  produced  fifty 
publications,  five  PhDs,  and  four  MSs  in  the  areas  of  adaptive  sensor  management,  image 
processing,  and  electromagnetic  modeling.  Several  of  the  publications  coming  out  of  this  MURI 
have won best paper awards. Technology transfer to General Dynamics, Raytheon, and the Army
Night Vision Electronics Laboratory has occurred: several of the students supported on this
MURI are now employees at these facilities, and relevant software has been released to them.

We  offer  these  accomplishments  as  evidence  of  the  success  of  this  MURI.  In  this  report  we 
highlight  these  accomplishments. 


II  Statement  of  the  problem  studied 

The  focus  of  this  MURI  was  to  formulate  implementable  sequential  detection,  sensor 
management  &  selection  strategies  that  could  be  applied  to  detection  of  mines,  tracking 
stationary  and  moving  targets  under  foliage  and  other  attenuation  and  obscuration  sources,  and 
facilities  detection  and  imaging.  A  key  aspect  of  our  project  was  to  account  for  the  physics  of 
wave  propagation  and  wave-target  interaction  to  build  tractable  low  dimensional  models. 
Another  key  aspect  was  the  derivation  of  bounds  and  approximations  to  optimal  sensor 
scheduling performance, as specified by partially observable Markov decision processes (POMDP) and
greedy  myopic  sensor  management  algorithms. 


III Summary of the most important results

The highlights of this MURI were the development of strategies for the following areas (students
associated with these projects are in parentheses):

1. Theory and application of information gain (IG) scheduling (Kreucher)

2.  Classification  reduction  for  reinforcement  learning  (RL)  and  POMDP  (Blatt,  Marble) 

3. Optimal energy scheduling for detection and estimation of scatterers (Rangarajan)

4.  Approximation  and  modeling  of  EM  scattering  and  propagation  in  inhomogeneous  media 
(Koh&Wang). 


1.  Theory  and  application  of  information  gain  (IG)  scheduling 

It  is  well  known  that  on-line  scheduling  of  sensor  actions  can  give  major  gains  in  performance 
relative to off-line scheduling. For example, in mine detection Larry Carin's group has shown
more than a factor of 2 improvement in the number of correct detections at a fixed false alarm rate.
Many researchers, including Carin, have used entropic measures such as information gain as a
scheduling  criterion.  Information  gain  scheduling  has  advantages  of  simplicity  and  lack  of 
dependence  on  the  particular  task  -  e.g.,  one  can  optimize  the  same  criterion  for  detection, 
classification  or  tracking.  Furthermore,  simulations  by  us  and  others  have  shown  low  sensitivity 
to  model  mismatch  and  near  optimality  in  terms  of  minimizing  task-dependent  risk.  Such 
properties  are  important  since,  for  example,  one  can  be  assured  that  if  one  schedules  sensors  for 
mine  detection  this  schedule  would  also  be  near  optimal  for  identification  of  mine  type. 

However,  until  now  there  has  not  been  any  theory  to  back  up  the  empirically  observed 
insensitivity  of  information  gain.  Nor  has  there  been  a  simple  way  to  relate  optimal  scheduling  to 
classification  to  facilitate  design  of  optimal  schedules.  Finally,  in  the  real  multitarget  tracking 
context  no  one  has  been  able  to  implement  scheduling  for  more  than  a  few  targets.  We  have 
made  significant  progress  in  these  directions. 

Introduction  of  Renyi  information  divergence  gain:  We  introduced  a  new  measure  of 
information  gain,  called  Renyi  information  divergence,  and  established  that  it  has  several 
advantages  over  other  measures  such  as  mutual  information,  entropy,  or  Fisher  information 
(Kreucher&etal: IPSN03). Principal among these is a universality result: any risk function can
be sandwiched between the Renyi alpha information gains for two different values of alpha
(Kreucher&etal: CDC05). This implies that the information gain is a universal surrogate in a
theoretically precise sense. We also showed that the Renyi divergence is simply computed under
the  multi-target  particle  filtering  model  (Kreucher&etal:  AES05)  for  the  information  state  and 
that  it  is  more  robust  to  mismodeling  errors  than  other  information  measures  (Kreucher&etal: 
SP05).  The  details  of  modeling  and  implementation  of  Renyi  information  gain  scheduling  for 
multiple  target  tracking  are  discussed  in  (Kreucher:  Thesis05). 
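
For reference, the Renyi alpha-divergence between two discrete distributions p and q has the standard form D_alpha(p||q) = (1/(alpha - 1)) log sum_i p_i^alpha q_i^(1 - alpha). The short sketch below evaluates it directly. The function names and the example distributions are illustrative assumptions, not taken from the cited papers, and q is assumed to be strictly positive.

```python
import math

def renyi_divergence(p, q, alpha):
    """Renyi alpha-divergence D_alpha(p || q) for discrete distributions.

    Assumes alpha > 0, alpha != 1, and that every entry of q is > 0.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(total) / (alpha - 1.0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence, the alpha -> 1 limit of the Renyi divergence."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: a posterior that has sharpened relative to a uniform prior.
prior = [1.0 / 3.0] * 3
posterior = [0.7, 0.2, 0.1]
gain = renyi_divergence(posterior, prior, alpha=0.5)
```

In a scheduling context, p would play the role of the updated density and q the density before the measurement, so the divergence measures how much a measurement changed what is known.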


Combined  particle  filtering  and  reinforcement  learning  for  managed  multi-target  tracking: 

We leveraged a mature methodology in the machine learning literature, known as relevance
feedback  learning,  to  schedule  deployment  of  a  suite  of  agile  sensors  (Kreucher&etal:  SP05). 

Our  method  for  managing  agile  sensors  learns  the  number  and  states  of  a  group  of  moving 
targets  occupying  a  surveillance  region.  The  system  computes  a  sensing  action  to  take  based  on 
the Renyi divergence. A measurement is made, providing relevance feedback, and the system
updates its probability density on the number and states of the targets. This procedure repeats
each time a sensor is available for use. Due to the difficulty in computing the probability
updates we have adopted a Bayesian Monte Carlo approach using particle filtering. We have shown
(using  simulated  measurements  on  real  recorded  target  trajectories)  that  this  method  of  sensor 
management  yields  a  ten-fold  gain  in  sensor  efficiency  when  compared  to  standard  unmanaged 
exhaustive  scanning.  This  method  has  also  been  applied  to  the  more  difficult  case  where  targets 
may  stop  and  start  or  change  dynamics  in  some  other  way  using  multiple  model  selection  and 
hidden  Markov  state  estimation  (Kreucher&etal:  ASAP04). 
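
The select, measure, update loop described above can be sketched on a deliberately simplified problem. The sketch assumes a single stationary target in one of a few cells, a sensor that looks at one cell per dwell, and hypothetical detection and false alarm probabilities. It uses an exact discrete Bayes update rather than the multi-target particle filter used in the cited work, so it illustrates only the structure of the loop.

```python
import math
import random

def renyi_divergence(p, q, alpha=0.5):
    # Renyi alpha-divergence between discrete distributions (entries of q > 0).
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1.0)

def likelihood(z, cell, looked_at, pd, pfa):
    # Probability of binary measurement z if the target is in `cell`
    # and the sensor looked at `looked_at`.
    p_detect = pd if cell == looked_at else pfa
    return p_detect if z == 1 else 1.0 - p_detect

def bayes_update(belief, looked_at, z, pd, pfa):
    unnorm = [b * likelihood(z, c, looked_at, pd, pfa) for c, b in enumerate(belief)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def expected_gain(belief, looked_at, pd, pfa, alpha=0.5):
    # Expected divergence between updated and current belief,
    # averaged over the two possible measurement outcomes.
    gain = 0.0
    for z in (0, 1):
        pz = sum(b * likelihood(z, c, looked_at, pd, pfa) for c, b in enumerate(belief))
        if pz > 0.0:
            posterior = bayes_update(belief, looked_at, z, pd, pfa)
            gain += pz * renyi_divergence(posterior, belief, alpha)
    return gain

def run(n_cells=10, true_cell=3, n_looks=30, pd=0.9, pfa=0.1, seed=0):
    rng = random.Random(seed)
    belief = [1.0 / n_cells] * n_cells
    for _ in range(n_looks):
        # Choose the sensing action with the largest expected information gain.
        action = max(range(n_cells), key=lambda a: expected_gain(belief, a, pd, pfa))
        # Simulate the measurement, then update the belief.
        p_detect = pd if action == true_cell else pfa
        z = 1 if rng.random() < p_detect else 0
        belief = bayes_update(belief, action, z, pd, pfa)
    return belief
```

Calling run() returns the belief over cells after the managed looks. Replacing the max over actions with a fixed raster scan gives the unmanaged baseline for comparison.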


2.  Classification  reduction  for  reinforcement  learning  (RL)  and  POMDP 

Reinforcement  learning  and  POMDP  approaches  have  been  widely  adopted  for  optimal 
scheduling  and  sensor  management.  One  major  difficulty  in  practical  implementation  of  these 
approaches is that the optimal scheduling policy is a complicated function of the distribution of
the  observations  under  each  sensing  action  and  under  each  hypothesis  (for  example  mine  or  no 
mine).  The  common  approach  is  to  approximate  the  distribution  and  then  use  this  approximation 
to  estimate  the  average  reward  under  various  sensor  actions  by  collecting  a  large  amount  of 
training  data.  The  estimated  optimal  schedule,  or  policy,  is  obtained  by  maximizing  the  estimated 
rewards.  A  largely  open  problem  is  the  question  of  generalization  error:  how  many  training 
samples  does  one  need  to  guarantee  that  the  estimated  optimal  policy  will  perform  near  optimally 
on  the  test  samples?  We  made  solid  progress  towards  this  objective  by  deriving  bounds  (Blatt: 
Thesis06)  on  the  generalization  error.  These  bounds  can  be  used  to  predict  the  number  of 
samples  needed  for  a  given  model  class,  e.g.  Gaussian  distributed  data.  The  method  used  to 
obtain  these  bounds  is  of  interest  in  its  own  right  since  it  converts  the  sensor  scheduling  problem 
into  a  sequential  classification  problem.  As  shown  in  (Blatt&Hero:  NIPS05)  this  classification 
problem  can  then  be  solved  using  off-the-shelf  classifiers  such  as  radial  basis  functions,  SVM,  or 
kNN classifier structures. When applied to mine detection we obtain a performance curve that
demonstrates the advantage of using non-myopic (2 stage) scheduling for deploying one of three
confirmation sensors (GPR, EMI, or Seismic) (Blatt&Hero: ICAPS06).
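
The idea of turning policy search into classification can be illustrated with a one-stage, two-action toy problem. In this sketch each training sample is labeled with the action that earned the larger reward and weighted by the reward gap, and a weighted kNN vote then serves as the policy. The scalar feature, the synthetic rewards, and the assumption that both rewards are known for every training sample are simplifications introduced here. The sequential construction in the cited work is more general.

```python
def make_weighted_examples(samples):
    """Convert (x, reward_action0, reward_action1) into (x, label, weight).

    The label is the better action; the weight is how much choosing wrongly costs.
    """
    return [(x, 0 if r0 >= r1 else 1, abs(r0 - r1)) for x, r0, r1 in samples]

def knn_policy(examples, x, k=3):
    """Pick an action for observation x by a weighted vote of the k nearest examples."""
    nearest = sorted(examples, key=lambda e: abs(e[0] - x))[:k]
    votes = [0.0, 0.0]
    for _, label, weight in nearest:
        votes[label] += weight
    return 0 if votes[0] >= votes[1] else 1

# Synthetic training set (hypothetical): action 0 pays 1 - x, action 1 pays x.
samples = [(i / 10, 1.0 - i / 10, i / 10) for i in range(11)]
examples = make_weighted_examples(samples)
```

With this training set the learned policy chooses action 0 for small x and action 1 for large x, without ever estimating the observation distributions.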

3. Optimal adaptive detection and estimation of scatterers

Active  radar  waveform  selection  is  an  important  problem  that  allows  a  sensing  system  to  make 
optimal  use  of  finite  resources.  For  example,  when  the  objective  is  imaging  of  a  random  medium 
with energy constraints, Papanicolaou has shown that there is an optimal offline sequence of
probing frequencies that depends on whether the objective is detection or image reconstruction.
We have established results in two directions: optimal waveform selection in a predictive
POMDP  setting  and  optimal  energy  allocation  for  detection  and  image  reconstruction. 

Optimal energy scheduling: In (Rangarajan: SSP05) we resolved an open question: can one
adaptively allocate energy over multiple dwells of a radar and achieve significant gains in
estimation or detection performance for inferring the scattering medium? In particular, even when
the radar is restricted to 2 dwells, we established that adaptive energy allocation can
achieve almost 30% improvement of estimation accuracy relative to non-adaptive allocation. The
fixed energy allocation concentrates all energy into a single transmission during each signal
period. The adaptive allocation takes advantage of the fact that if energy is divided over the
signaling interval a small amount of energy may be sufficient to detect the presence of a strong
scatterer, allowing one to keep energy in reserve for weaker and more ambiguous scatterers.
Under a Rayleigh scattering model (Born approximation) the optimal strategy turns out to be to
break up the energy into quanta and transmit a little energy initially, deciding to transmit more
energy only if the first quantum generates a “good” observation (one with high instantaneous
SNR). The characteristic “ramp” shape of the optimal energy allocation in time mimics the chirp
type  of  waveform  in  frequency  that  is  common  in  adaptive  radar  waveform  design.  A  journal 
paper  outlining  our  theory  was  recently  accepted  in  the  inaugural  issue  of  the  IEEE  Journal  on 
Selected  Topics  in  Signal  Processing  (Special  issue  on  waveform  design)  (Rangarajan&etal: 
JSTSP07). 
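
To see why adaptivity is the source of the gain, consider a simple two-dwell formalization. This is an assumed toy model with a scalar scatter coefficient theta and Gaussian noise, and the symbols below are introduced here for illustration. It is not necessarily the model used in the cited papers.

```latex
% Two-dwell observation model with total energy budget E_0 (assumed toy model)
y_i = \sqrt{E_i}\,\theta + w_i, \qquad w_i \sim \mathcal{N}(0, \sigma^2), \quad i = 1, 2,
\qquad E_1 + E_2 \le E_0 .

% Non-adaptive case: E_1 and E_2 are fixed in advance
\hat{\theta} = \frac{\sqrt{E_1}\, y_1 + \sqrt{E_2}\, y_2}{E_1 + E_2},
\qquad
\mathbb{E}\big[(\hat{\theta} - \theta)^2\big] = \frac{\sigma^2}{E_1 + E_2} \;\ge\; \frac{\sigma^2}{E_0} .

% Adaptive case: the second-dwell energy depends on the first observation
E_2 = g(y_1), \qquad \mathbb{E}\big[E_1 + g(y_1)\big] \le E_0 .
```

In this toy model the error of the non-adaptive least-squares estimate depends only on the total energy, so no fixed split between the two dwells improves on a single transmission. Any improvement must come from choosing the rule g, that is, from letting the second transmission depend on the first observation under an average energy constraint.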

Adaptive multichannel waveform selection and design: In (Rangarajan&etal: ICASSP06) we
developed  a  significantly  different  approach  to  waveform  selection  than  what  has  been 
previously  proposed.  The  main  challenge  for  multichannel  adaptive  waveform  selection  is  the 
curse  of  dimensionality:  as  the  number  of  channels  increases  the  number  of  possible 
combinations  of  waveforms  that  can  be  transmitted  through  the  set  of  channels  increases 
exponentially.  Thus  exhaustive  search  methods  for  selection  are  not  tractable.  Furthermore,  in  an 
on-line  sensor  management  setting  decisions  about  the  best  waveforms  to  deploy  at  the  next 
radar dwell can only depend on past measurements. We developed a method based on ensemble
learning  and  generalized  additive  models  (GAM)  that  breaks  the  exponential  complexity  logjam 
without  appreciable  loss  in  performance.  This  was  demonstrated  for  radar  tracking  of  a  target 
with  non-linear  (two  state)  dynamics. 
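
The scale of the combinatorial problem, and the reason an additive structure helps, can be seen in a small sketch. It assumes, purely for illustration, that the predicted benefit of a joint waveform assignment is a sum of per-channel terms; the score table is made up. Under that assumption the best joint assignment is found channel by channel, with K*M evaluations instead of M^K for K channels and M candidate waveforms. This is not the ensemble learning method of the cited paper.

```python
import itertools

def exhaustive_best(scores):
    """Search all joint assignments; scores[k][m] is the benefit of waveform m on channel k."""
    n_waveforms = len(scores[0])
    best, best_val, evaluated = None, float("-inf"), 0
    for combo in itertools.product(range(n_waveforms), repeat=len(scores)):
        evaluated += 1
        val = sum(scores[k][m] for k, m in enumerate(combo))
        if val > best_val:
            best, best_val = combo, val
    return best, evaluated

def additive_best(scores):
    """Exploit additivity: maximize each channel's term independently."""
    combo = tuple(max(range(len(row)), key=row.__getitem__) for row in scores)
    evaluated = sum(len(row) for row in scores)
    return combo, evaluated

# Hypothetical score table: 4 channels, 3 candidate waveforms per channel.
scores = [
    [0.2, 0.9, 0.1],
    [0.5, 0.4, 0.8],
    [0.3, 0.1, 0.2],
    [0.7, 0.6, 0.95],
]
```

For this table both searches return the same assignment, but the exhaustive search evaluates 3^4 = 81 combinations while the additive search evaluates 12 terms. The gap grows exponentially with the number of channels.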


4.  Approximation  and  modeling  of  EM  scattering  and  propagation  in  inhomogeneous 
media 

A hybrid full-wave and single scattering theory model was completed that can handle scattering
from a hard target in a random medium. The model computes scattering from the hard target
exactly and accounts for first-order near-field interaction between the foliage and the hard
target. A journal paper was just submitted on the topic. We have also completed a comprehensive
foliage and hard target model, including near-field interactions at high frequencies. A journal
paper was just submitted on the topic. We have also completed various foliage attenuation models
and a study on multi-polarization camouflaged target detection. A journal paper on the latter
study was just submitted.